Better Regex

Write multi-line regex with comments, special functions, and reference lookups.

Under Development

This library is under active development. Comments & multi-line regex work. ::ref SHOULD work, but it still needs tests and documentation. The ::combine function does not work. Until the docs catch up, you can also look at the tests.

It's not fully ready to use, and it could still see some major changes.

Features

  • Multi-line regex
    • Escape spaces at the beginning and end of a line (\ ), or they'll be removed
  • Comments in regex ( # a comment)
    • requires a single space before #.
    • If you have a space then hash ( #) in your regex, escape the hash: \#
  • Reference lookups (::ref({{lookupKey}}))
    • This needs new tests & to be documented
  • (coming sometime) Combine multiple references only if selected
    • ::combineSelected({{key1}}{{key2}}{{key3}} ;; | ) to combine selected keys with pipe (|)
  • (coming sometime) Extend with your own functions
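
To make the multi-line and comment rules above concrete, here is a minimal sketch of how such a source could be collapsed into one line. It is a simplified approximation written for this README, not the library's parser: collapseRegex() is a hypothetical name, and the real parser may treat edge cases differently.

```php
<?php
// Simplified sketch only. collapseRegex() is a hypothetical helper,
// not part of this library's API.
function collapseRegex(string $source): string
{
    $out = '';
    foreach (preg_split('/\R/', $source) as $line) {
        // Remove a comment: '#' at the start of the line or right after whitespace.
        // An escaped hash (\#) is preceded by a backslash, so it is left alone.
        $line = preg_replace('/(?<=\s)#.*$|^#.*$/', '', $line);

        // Drop leading whitespace. An escaped space starts with '\' and survives.
        $line = ltrim($line);

        // Drop trailing whitespace, unless the first trailing character is
        // escaped by an odd number of backslashes (for example "abc\ ").
        $trimmed = rtrim($line);
        if (strlen($trimmed) < strlen($line)) {
            $slashes = strlen($trimmed) - strlen(rtrim($trimmed, '\\'));
            if ($slashes % 2 === 1) {
                $trimmed .= $line[strlen($trimmed)];
            }
        }
        $out .= $trimmed;
    }
    return $out;
}

$source = <<<'REGEX'
One
    \ Two # a comment
    ( ( # another comment
        \#Three
    ))?
REGEX;

echo collapseRegex($source), "\n";
// prints: One\ Two( (\#Three))?
```

The sketch works line by line, so a comment can never swallow content from the next line.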

Note

  • In plain PHP, the PCRE_EXTENDED flag (the x modifier) can be specified to make the regex engine ignore newlines & comments.
    • Of course, that doesn't enable the reference lookups provided by this library.
    • This lib uses its own parsing to remove comments and handle escaped spaces at the ends of lines
    • I didn't know about PCRE_EXTENDED until I was just about done with the parser here. I think this lib could still be helpful in some cases, like parsing a complex regex down to a 1-line regex and then using that in a language that doesn't allow comments & multi-line patterns
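
For comparison, this is roughly what the PCRE_EXTENDED route looks like with PHP's built-in preg functions and no library at all. The pattern and subject strings are made up for illustration.

```php
<?php
// The x modifier (PCRE_EXTENDED) lets the pattern span lines and carry comments.
$pattern = '/
    One           # literal text
    \ Two         # an escaped space is kept; unescaped whitespace is ignored
    ( \#Three )?  # an escaped hash is a literal "#"
/x';

var_dump(preg_match($pattern, 'One Two#Three'));  // int(1)
var_dump(preg_match($pattern, 'OneTwo'));         // int(0)
```

Note that the comments here live inside the pattern at match time, so they can't contain the delimiter (/) unescaped.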

TODO

  • Finish writing tests
  • Test and document ::ref
  • Fix ::combine function & disable selection
  • Add ::combineSelected function to select from the list
  • Write install instructions
  • Improve data-loading...
    • Add feature to load regex source from a directory (currently it's a bunch of manual requires...)

Also TODO, but less crucial atm

  • Move the complex test example below into an 'Advanced.md' file & write a simpler test for demonstration
  • Re-write the test for reference lookups & include that test as an example.
  • Add feature to load regex source in different formats
  • Add extensibility features
  • Notes for contributors
  • Get Feedback & adapt

Example, no reference-lookups

Source Regex (multi-line, comments)

<<<REGEX  
One  
#a comment at the start  
## Another start of line comment  
#  
##  
\#\   
\  
\   
    eol_esc_slash\\\\\\\\\\\\  
    eol_space_test\\\\\\\\\           
    abc\ #Should this be a comment? It IS, only because making it NOT a comment is much more complicated & harder to communicate.  
    def\  #this IS a comment  
    ghi\ \#this is not a comment  
    jkl\\\\ #this IS a comment  
        #  
    \ Two # Am a comment  
    ( ( # Am another comment  
        \#Three # This seems like a lot of comments  
		#[0-9] # You need escape your # lie \# to use a space + hash as not-a-comment  
    ))?  
REGEX,  

Compiles into

  1. Because of backslash escaping in PHP strings, every pair (\\) is reduced to a single \, so this example output shows almost twice as many \ as the regex actually has in production
  2. The single quotes (') surrounding it would not be there in production. They appear because this example is exported from the test file, where the regex has to be wrapped in quotes as a PHP string
'One\\#\\ \\ \\ eol_esc_slash\\\\\\\\\\\\eol_space_test\\\\\\\\\\ abc\\ def\\ ghi\\ \\#this is not a commentjkl\\\\\ Two( (\#Three))?',  

Executed with

# Key not found for Example.Cleaning.Usage  



Example (obsolete)

This is still probably accurate, but our examples need to be exported from our tests & this one is hard-coded in the .md file. It still gives a sense of what this lib is for, though.

Considering the full example...
A BetterRegex string:

One  
    Two  
    ( ( # Look at me, I'm a comment.  
        (  
            ::ref({{NameOfRef}})  
        )  
        ( (  
            ::combineSelected({{apples}}{{pear}}{{melon}} ;; |)  
        ))  
    ))?  

If only apples & pear are selected, compiles into:

OneTwo( ((Three|Four)( ((A|a)pple(s)?|(P|p)ear(s)?))))?
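
The ::ref step in that example can be pictured as a plain text substitution. The sketch below is an assumption about the general idea, not this library's implementation: expandRefs() and the $refs lookup table are hypothetical names, and the value for NameOfRef is taken from the compiled output above.

```php
<?php
// Hypothetical sketch. expandRefs() and $refs are illustrative names,
// not this library's API.
function expandRefs(string $regex, array $refs): string
{
    return preg_replace_callback(
        // Matches ::ref({{someKey}}) and captures someKey.
        '/::ref\(\{\{([^{}]+)\}\}\)/',
        function (array $m) use ($refs): string {
            if (!array_key_exists($m[1], $refs)) {
                throw new InvalidArgumentException("Unknown reference: {$m[1]}");
            }
            return $refs[$m[1]];
        },
        $regex
    );
}

$refs = ['NameOfRef' => 'Three|Four'];
echo expandRefs('OneTwo( ((::ref({{NameOfRef}}))))?', $refs), "\n";
// prints: OneTwo( ((Three|Four)))?
```

Throwing on an unknown key is a design choice of the sketch; silently leaving the ::ref in place would produce a regex that compiles but matches the wrong thing.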